Using Formatted Input and Output
Formatted input and output can be used with the PRINT and READ procedures, with the STRING function, with the IDL_Variable::toString method, or within template literal strings. IDL supports two syntaxes for format strings: a FORTRAN style and a C printf-style.
Note: IDL uses the standard I/O function snprintf to do its formatting. Different operating systems may produce slight differences in the output.
Examples
Use the FORTRAN style to print out three numbers surrounded by brackets:
PRINT, FINDGEN(3), FORMAT = '("The values are: [", 3(" ",f0), "]")'
Now using the C printf-style:
PRINT, FINDGEN(3), FORMAT = 'The values are: [ %f %f %f]'
In both cases IDL prints:
The values are: [ 0.000000 1.000000 2.000000]
The FORTRAN style is less readable but more compact when outputting multiple values, while the C printf-style is easier to read but more verbose for multiple values.
You can also use formatted output with template literal strings, using either FORTRAN or C-printf style formats:
x = [1:5]
print, `The values are: ${x,"%5d"}`
print, `The values are: ${x,"(i5)"}`
In both cases IDL prints:
The values are: [    1,    2,    3,    4,    5]
See the Examples section at the end of this topic for more examples.
FORTRAN-Style Format Strings
Syntax
Format strings using the FORTRAN style have the form:
FORMAT = '(q1f1s1f2s2 ... fnqn)'
where q, f, and s are described below.
The FORTRAN-style format string must begin and end with parentheses.
Record Terminators
The q is zero or more slash (/) record terminators. On output, each record terminator causes the output to move to a new line. On input, each record terminator causes the next line of input to be read.
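As a minimal sketch of record terminators on output (the literal strings are illustrative only), each slash in the following format ends the current record, so the text is written on three separate lines:

```idl
; Each slash (/) ends the current output record and starts a new line.
PRINT, FORMAT = '("First line", /, "Second line", /, "Third line")'
; prints:
; First line
; Second line
; Third line
```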
Format Codes
The f is a format code that specifies how data should be transferred. The code f can also be a nested format specification enclosed in parentheses, called a group specification:
...n(q1f1s1f2s2 ... fnqn)...
A group specification consists of an optional repeat count n followed by a format specification enclosed in parentheses. For example, the format specification:
FORMAT = '("Result: ", "<",I5,">", "<",I5,">")'
can be written more concisely using a group specification:
FORMAT = '("Result: ", 2("<",I5,">"))'
If the repeat count is 1 or is not given, the parentheses serve only to group format codes for use in format reversion (discussed under Format Reversion below).
See FORTRAN-Style Format Codes for a list of available format codes.
Field Separators
s is a field separator. A field separator consists of one or more commas (,) and/or slash record terminators (/). The only restriction is that two commas cannot occur side-by-side.
The arguments provided in a call to a formatted input/output routine are called the argument list. The argument list specifies the data to be moved between memory and the file. Note that arrays are considered to be a collection of scalar data elements, and IDL structures are processed in terms of their individual tag values. Complex scalar values are treated as two floating-point values.
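The following sketch illustrates how the argument list is flattened into scalar data. The values and the structure tag names (count, mean) are hypothetical and chosen only for illustration:

```idl
; An array is a collection of scalars: three elements fill three I3 fields.
PRINT, INDGEN(3), FORMAT = '(3I3)'
; prints "  0  1  2"

; A complex scalar is treated as two floating-point values
; (real part, then imaginary part).
PRINT, COMPLEX(1.5, -2.0), FORMAT = '(2F6.2)'
; prints "  1.50 -2.00"

; A structure is processed one tag value at a time.
s = {count: 3, mean: 2.5}
PRINT, s, FORMAT = '(I3, F6.2)'
; prints "  3  2.50"
```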
Processing Rules for FORTRAN-Style Format Strings
IDL uses the following rules to process FORTRAN-style format strings:
- Traverse the format string from left to right, processing each record terminator and format code until an error occurs or no data is left in the argument list. The comma field separator serves no purpose except to delimit the format codes.
- When a slash record terminator (/) is encountered, the current record is completed, and a new one is started. For output, this means that a new line is started. For input, it means that the rest of the current input record is ignored, and the next input record is read.
- When a format code that does not transfer data is encountered, process it according to its meaning.
- When a format code that transfers data to or from the argument list is encountered, it is matched up with the next datum in the argument list.
  - On input, read data from the file and format it according to the format code. If the data type of the input data does not agree with the data type of the input variable, do type conversion to match the variable if possible; otherwise, issue a type conversion error and stop.
  - On output, write the data according to the format code. If the data type does not agree with the format code, do the type conversion prior to doing the output if possible. If the type conversion is not possible, issue a type conversion error and stop.
- If the last closing parenthesis of the format string is reached and there are no data left on the argument list, then terminate. If, however, there are still data to be processed on the argument list, then part or all of the format specification is reused. This process is called format reversion.
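The type conversion rules above can be sketched as follows. The values are arbitrary, and READS is used here only so that the input case can be shown without a data file:

```idl
; Output: the integer 5 does not match the F code, so it is converted
; to floating point before being written.
PRINT, 5, FORMAT = '(F6.2)'
; prints "  5.00"

; Input: the field is read with an I code, then converted to match the
; type of the receiving variable, which stays floating point.
x = 0.0
READS, ' 42', x, FORMAT = '(I3)'
HELP, x
```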
Format Reversion
In format reversion, the current record is terminated, a new one is initiated, and format control reverts to the group repeat specification whose opening parenthesis matches the next-to-last closing parenthesis of the format string. If the format does not contain a group repeat specification, format control returns to the initial opening parenthesis of the format string. For example, the IDL command:
PRINT, FORMAT = '("The values are: ", 2("<", I1, ">"))', $
INDGEN(6)
results in the output
The values are: <0><1>
<2><3>
<4><5>
The process involved in generating this output is as follows:
1. Output the string “The values are: ”.
2. Process the group specification and output the first two values. The end of the format specification is encountered, so end the output record. Data are remaining, so move back to the group specification 2("<", I1, ">").
3. Repeat step 2 until no data remain.
4. End the output record.
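For comparison, here is a small sketch of the other reversion case, in which the format contains no group repeat specification. Control then returns to the initial opening parenthesis, so the literal text is repeated on every record:

```idl
; No group specification: reversion restarts at the first parenthesis,
; so "Value: " is written again for each remaining datum.
PRINT, FORMAT = '("Value: ", I1)', INDGEN(3)
; prints:
; Value: 0
; Value: 1
; Value: 2
```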
C printf-Style Format Strings
Syntax
Format strings using the C printf-style have the form:
FORMAT = 'Some text %f more text %f etc'
where f is the format code, described below.
Note: IDL distinguishes between the two different formats (FORTRAN or C printf) by whether the format string begins with a parenthesis or not. If there is no parenthesis then IDL assumes it is a C printf-style format.
Tip: If you need a parenthesis at the beginning of a C printf-style format string, you should escape the parenthesis with a backslash character (\).
Format Codes
See C printf-Style Format Codes for a list of available format codes.
Processing Rules for C printf-Style Format Strings
IDL uses the following rules to process C printf-style format strings:
- Traverse the format string from left to right, processing each format code until an error occurs or no data is left in the argument list.
- Process any ordinary characters that are encountered. For output, print out the characters.
- When a format code that transfers data to or from the argument list is encountered, it is matched up with the next datum in the argument list.
  - On input, read data from the file and format it according to the format code. If the data type of the input data does not agree with the data type of the input variable, do type conversion to match the variable if possible; otherwise, issue a type conversion error and stop.
  - On output, write the data according to the format code. If the data type does not agree with the format code, do the type conversion prior to doing the output if possible. If the type conversion is not possible, issue a type conversion error and stop.
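A brief sketch of these rules using the STRING function follows. The variable name label and the values are illustrative assumptions, not part of the original topic:

```idl
; Ordinary characters are copied to the output as-is; each % code
; consumes the next datum from the argument list.
label = STRING(42, 3.5, FORMAT = '%d items at %6.2f each')
PRINT, label
; prints "42 items at   3.50 each"
```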
Examples
Reading Formatted Table Data
IDL explicitly formatted input/output has the power and flexibility to handle almost any kind of formatted data. A common use of explicitly formatted input/output involves reading and writing tables of data. Consider a data file containing employee data records. Each employee has a name (String, 32 columns) and the number of years they have been employed (Integer, 3 columns) on the first line. The next two lines contain each employee’s monthly salary for the last twelve months. A sample file named employee.dat with this format might look like the following:
Bullwinkle                       10
    1000.0   9000.97    1100.0                        2000.0
    5000.0    3000.0   1000.12    3500.0    6000.0     900.0
Boris                            11
     400.0     500.0   1300.10     350.0     745.0    3000.0
     200.0     100.0     100.0      50.0      60.0      0.25
Natasha                          10
     950.0    1050.0    1350.0     410.0     797.0    200.36
    2600.0    2000.0    1500.0    2000.0    1000.0     400.0
Rocky                            11
    1000.0    9000.0    1100.0       0.0       0.0   2000.37
    5000.0    3000.0   1000.01    3500.0    6000.0    900.12
The following IDL statements read data with the above format and produce a summary of the contents of the file:
;Open data file for input.
OPENR, 1, 'employee.dat'
;Create variables to hold the name, number of years, and monthly
;salary.
name = '' & years = 0 & salary = FLTARR(12)
;Output a heading for the summary.
PRINT, FORMAT='("Name", 28X, "Years", 4X, "Yearly Salary")'
;Note: The actual dashed line is longer than is shown here.
PRINT, '========'
;Loop over each employee.
WHILE (~ EOF(1)) DO BEGIN
  ;Read the data on the next employee.
  READF, 1, $
    FORMAT = '(A32,I3,2(/,6F10.2))', name, years, salary
  ;Output the employee information. Use TOTAL to sum the monthly
  ;salaries to get the yearly salary.
  PRINT, FORMAT='(A32,I5,5X,F10.2)', name, years, TOTAL(salary)
ENDWHILE
CLOSE, 1
The output from executing these statements on employee.dat is as follows:
Name                            Years    Yearly Salary
======================================================
Bullwinkle                         10       32501.09
Boris                              11        6805.35
Natasha                            10       14257.36
Rocky                              11       32500.50
Reading Records With Multiple Array Elements
Frequently, data are written to files with each record containing single elements of more than one array. One example might be a file consisting of observations of altitude, pressure, temperature, and velocity with each line or record containing a value for each of the four variables. Because IDL has no equivalent of the FORTRAN implied DO list, special procedures must be used to read or write this type of file.
The first approach, which is the simplest, may be used only if all of the variables have the same data type. An array is created with as many columns as there are variables and as many rows as there are elements. The data are read into this array, the array is transposed storing each variable as a row, and each row is extracted and stored into a variable which becomes a vector. For example, the FORTRAN program which writes the data and the IDL program which reads the data are as follows:
FORTRAN Write:
DIMENSION ALT(100),PRES(100),TEMP(100),VELO(100)
OPEN (UNIT = 1, STATUS='NEW', FILE='TEST')
WRITE(1,'(4(1x,g15.5))') &
  (ALT(I),PRES(I),TEMP(I),VELO(I),I=1,100)
IDL Read:
;Open file for input.
OPENR, 1, 'test'
;Define variable (NVARS by NOBS).
A = FLTARR(4,100)
;Read the data.
READF, 1, A
;Transpose so that columns become rows.
A = TRANSPOSE(A)
;Extract the variables.
ALT = A[*, 0]
PRES = A[*, 1]
TEMP = A[*, 2]
VELO = A[*, 3]
Note that this same example may be written without the implied DO list, writing all elements for each variable contiguously and simplifying matters considerably:
FORTRAN Write:
DIMENSION ALT(100),PRES(100),TEMP(100),VELO(100)
OPEN (UNIT = 1, STATUS='NEW', FILE='TEST')
WRITE (1,'(4(1x,G15.5))') ALT,PRES,TEMP,VELO
IDL Read:
;Define variables.
ALT = FLTARR(100)
PRES = ALT & TEMP = ALT & VELO = ALT
OPENR, 1, 'test'
READF, 1, ALT, PRES, TEMP, VELO
A different approach must be taken when the columns contain different data types or the number of lines or records is not known. This method involves defining the arrays, defining a scalar variable to contain each datum in one record, writing a loop to read each line into the scalars, and then storing the scalar values into each array. For example, assume that a fifth variable, the name of an observer, which is of string type, is added to the variable list. The FORTRAN output routine and IDL input routine are as follows:
FORTRAN Write:
DIMENSION ALT(100),PRES(100),TEMP(100),VELO(100)
CHARACTER * 10 OBS(100)
OPEN (UNIT = 1, STATUS = 'NEW', FILE = 'TEST')
WRITE (1,'(4(1X,G15.5),2X,A)') &
  (ALT(I),PRES(I),TEMP(I),VELO(I),OBS(I),I=1,100)
IDL Read:
;Access file. Read files containing from 1 to 200 records.
OPENR, 1, 'test'
;Define vector, make it large enough for the biggest case.
ALT = FLTARR(200)
;Define other vectors using the first.
PRES = ALT & TEMP = ALT & VELO = ALT
;Define string array.
OBS = STRARR(200)
;Define scalar string.
OBSS = ''
;Initialize counter.
I = 0
WHILE ~ EOF(1) DO BEGIN
  ;Stop before reading if the vectors are already full.
  IF I GE 200 THEN STOP, 'Too many records'
  ;Read scalars.
  READF, 1, $
    FORMAT = '(4(1X, G15.5), 2X, A10)', $
    ALTS, PRESS, TEMPS, VELOS, OBSS
  ;Store in each vector.
  ALT[I] = ALTS & PRES[I] = PRESS & TEMP[I] = TEMPS
  VELO[I] = VELOS & OBS[I] = OBSS
  ;Increment counter. On exit, I is the number of records read.
  I = I + 1
ENDWHILE
If desired, after the file has been read and the number of observations is known, the arrays may be truncated to the correct length using a series of statements similar to the following:
ALT = ALT[0:I-1]
The above example represents a worst case. Reading is greatly simplified by writing data of the same type contiguously and by knowing the size of the file. One frequently used technique is to write the number of observations into the first record so that the size is known when the data are read.
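A minimal sketch of the count-in-the-first-record technique follows. The file name obs.dat and its layout (one count record, then one record of four values per observation) are assumptions made for illustration:

```idl
; Hypothetical layout: record 1 holds the observation count, followed by
; one record per observation containing four floating-point values.
OPENR, lun, 'obs.dat', /GET_LUN
nobs = 0L
READF, lun, nobs
; Size the array from the count instead of guessing a maximum.
data = FLTARR(4, nobs)
READF, lun, data
FREE_LUN, lun
; Each variable is one column of the 4-by-nobs array.
ALT  = REFORM(data[0, *])
PRES = REFORM(data[1, *])
TEMP = REFORM(data[2, *])
VELO = REFORM(data[3, *])
```

Because the array is sized from the file itself, no counter, capacity check, or later truncation is needed.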
Note: It might be tempting to implement a loop in IDL which reads the data values directly into array elements, using a statement such as the following:
FOR I = 0, 99 DO READF, 1, ALT[I], PRES[I], TEMP[I], VELO[I]
This statement is incorrect. Subscripted elements (including ranges) are temporary expressions passed as values to procedures and functions (READF in this example). Parameters passed by value do not pass results back to the caller. The proper approach is to read the data into scalars and assign the values to the individual array elements as follows:
A = 0. & P = 0. & T = 0. & V = 0.
FOR I = 0, 99 DO BEGIN
  READF, 1, A, P, T, V
  ALT[I] = A & PRES[I] = P & TEMP[I] = T & VELO[I] = V
ENDFOR